The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
In this paper, we consider the inventory management (IM) problem where we need to make replenishment decisions for a large number of stock keeping units (SKUs) to balance their supply and demand. In our setting, the constraint on the shared resources (such as the inventory capacity) couples the otherwise independent control for each SKU. We formulate the problem with this structure as Shared-Resource Stochastic Game (SRSG)and propose an efficient algorithm called Context-aware Decentralized PPO (CD-PPO). Through extensive experiments, we demonstrate that CD-PPO can accelerate the learning procedure compared with standard MARL algorithms.
translated by 谷歌翻译
Minimum Bayesian Risk Decoding (MBR) emerges as a promising decoding algorithm in Neural Machine Translation. However, MBR performs poorly with label smoothing, which is surprising as label smoothing provides decent improvement with beam search and improves generality in various tasks. In this work, we show that the issue arises from the un-consistency of label smoothing on the token-level and sequence-level distributions. We demonstrate that even though label smoothing only causes a slight change in the token-level, the sequence-level distribution is highly skewed. We coin the issue \emph{distributional over-smoothness}. To address this issue, we propose a simple and effective method, Distributional Cooling MBR (DC-MBR), which manipulates the entropy of output distributions by tuning down the Softmax temperature. We theoretically prove the equivalence between pre-tuning label smoothing factor and distributional cooling. Experiments on NMT benchmarks validate that distributional cooling improves MBR's efficiency and effectiveness in various settings.
translated by 谷歌翻译
Partial MaxSAT (PMS) and Weighted PMS (WPMS) are two practical generalizations of the MaxSAT problem. In this paper, we propose a local search algorithm for these problems, called BandHS, which applies two multi-armed bandits to guide the search directions when escaping local optima. One bandit is combined with all the soft clauses to help the algorithm select to satisfy appropriate soft clauses, and the other bandit with all the literals in hard clauses to help the algorithm select appropriate literals to satisfy the hard clauses. These two bandits can improve the algorithm's search ability in both feasible and infeasible solution spaces. We further propose an initialization method for (W)PMS that prioritizes both unit and binary clauses when producing the initial solutions. Extensive experiments demonstrate the excellent performance and generalization capability of our proposed methods, that greatly boost the state-of-the-art local search algorithm, SATLike3.0, and the state-of-the-art SAT-based incomplete solver, NuWLS-c.
translated by 谷歌翻译
This paper is about an extraordinary phenomenon. Suppose we don't use any low-light images as training data, can we enhance a low-light image by deep learning? Obviously, current methods cannot do this, since deep neural networks require to train their scads of parameters using copious amounts of training data, especially task-related data. In this paper, we show that in the context of fundamental deep learning, it is possible to enhance a low-light image without any task-related training data. Technically, we propose a new, magical, effective and efficient method, termed \underline{Noi}se \underline{SE}lf-\underline{R}egression (NoiSER), which learns a gray-world mapping from Gaussian distribution for low-light image enhancement (LLIE). Specifically, a self-regression model is built as a carrier to learn a gray-world mapping during training, which is performed by simply iteratively feeding random noise. During inference, a low-light image is directly fed into the learned mapping to yield a normal-light one. Extensive experiments show that our NoiSER is highly competitive to current task-related data based LLIE models in terms of quantitative and visual results, while outperforming them in terms of the number of parameters, training time and inference speed. With only about 1K parameters, NoiSER realizes about 1 minute for training and 1.2 ms for inference with 600$\times$400 resolution on RTX 2080 Ti. Besides, NoiSER has an inborn automated exposure suppression capability and can automatically adjust too bright or too dark, without additional manipulations.
translated by 谷歌翻译
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
现有检测方法通常使用参数化边界框(Bbox)进行建模和检测(水平)对象,并将其他旋转角参数用于旋转对象。我们认为,这种机制在建立有效的旋转检测回归损失方面具有根本的局限性,尤其是对于高精度检测而言,高精度检测(例如0.75)。取而代之的是,我们建议将旋转的对象建模为高斯分布。一个直接的优势是,我们关于两个高斯人之间距离的新回归损失,例如kullback-leibler Divergence(KLD)可以很好地对齐实际检测性能度量标准,这在现有方法中无法很好地解决。此外,两个瓶颈,即边界不连续性和正方形的问题也消失了。我们还提出了一种有效的基于高斯度量的标签分配策略,以进一步提高性能。有趣的是,通过在基于高斯的KLD损失下分析Bbox参数的梯度,我们表明这些参数通过可解释的物理意义进行了动态更新,这有助于解释我们方法的有效性,尤其是对于高精度检测。我们使用量身定制的算法设计将方法从2-D扩展到3-D,以处理标题估计,并在十二个公共数据集(2-D/3-D,空中/文本/脸部图像)上进行了各种基本检测器的实验结果。展示其优越性。
translated by 谷歌翻译
本文回顾了AIM 2022上压缩图像和视频超级分辨率的挑战。这项挑战包括两条曲目。轨道1的目标是压缩图像的超分辨率,轨迹〜2靶向压缩视频的超分辨率。在轨道1中,我们使用流行的数据集DIV2K作为培训,验证和测试集。在轨道2中,我们提出了LDV 3.0数据集,其中包含365个视频,包括LDV 2.0数据集(335个视频)和30个其他视频。在这一挑战中,有12支球队和2支球队分别提交了赛道1和赛道2的最终结果。所提出的方法和解决方案衡量了压缩图像和视频上超分辨率的最先进。提出的LDV 3.0数据集可在https://github.com/renyang-home/ldv_dataset上找到。此挑战的首页是在https://github.com/renyang-home/aim22_compresssr。
translated by 谷歌翻译
与传统方法相比,学到的图像压缩已在PSNR和MS-SSIM中取得了非凡的速率延伸性能。但是,它遭受了密集的计算,这对于现实世界的应用是无法忍受的,目前导致其工业应用有限。在本文中,我们将神经体系结构搜索(NAS)介绍到具有较低延迟的更有效网络,并利用量化以加速推理过程。同时,已经为提高效率而做出了工程努力。使用PSNR和MS-SSIM的混合损失以更好的视觉质量进行了优化,我们获得的MSSIM比JPEG,JPEG XL和AVIF在所有比特率上都高得多,而JPEG XL和AVIF之间的PSNR则获得了PSNR。与JPEG-Turbo相比,我们的LIC的软件实施实现了可比较甚至更快的推理速度,而多次比JPEG XL和AVIF快。此外,我们的LIC实施达到了145 fps的惊人吞吐量,用于编码为208 fps,用于在Tesla T4 GPU上解码1080p图像。在CPU上,我们实施的延迟与JPEG XL相当。
translated by 谷歌翻译
旅行推销员问题(TSP)是许多实用变体的经典NP-HARD组合优化问题。 Lin-Kernighan-Helsgaun(LKH)算法是TSP的最先进的本地搜索算法之一,LKH-3是LKH的强大扩展,可以解决许多TSP变体。 LKH和LKH-3都将一个候选人与每个城市相关联,以提高算法效率,并具有两种不同的方法,称为$ \ alpha $ - 计算和Popmusic,以决定候选人集。在这项工作中,我们首先提出了一种可变策略加强LKH(VSR-LKH)算法,该算法将三种强化学习方法(Q-Learning,SARSA和Monte Carlo)与LKH算法结合在一起,以解决TSP。我们进一步提出了一种称为VSR-LKH-3的新算法,该算法将可变策略强化学习方法与LKH-3结合在一起,用于典型的TSP变体,包括带有时间窗口(TSPTW)和彩色TSP(CTSP)的TSP。所提出的算法取代了LKH和LKH-3中的不灵活的遍历操作,并让算法学会通过增强学习在每个搜索步骤中做出选择。 LKH和LKH-3都具有$ \ alpha $量或Popmusic方法,我们的方法都可以显着改善。具体而言,对236个公共和广泛使用的TSP基准的经验结果具有多达85,900个城市,证明了VSR-LKH的出色表现,扩展的VSR-LKH-3也显着超过了TSPTW和TSPTW和TSPTW和TSPTW的最新启发式方法CTSP。
translated by 谷歌翻译